Automated Event Coding Using Conditional Random Fields

Authors

  • Adam Stepinski
  • Richard Stoll
  • Devika Subramanian
Abstract

We present an approach using conditional random fields (CRFs) for extracting and coding political events from newswire stories. Coding an event from a news story requires the extraction of the actor and target, the event itself, and the date of occurrence. Actors and targets are political entities. Events are classified into 22 discrete categories in the popular WEIS scheme (Tomlinson 1993). Lead sentences of newswire stories are surprisingly complex in structure, and we demonstrate the importance of segmenting them into their constituent phrases as a pre-processing step. We design a CRF model that labels each word in a phrase as being part of the actor, the target, or a specific event type. Using two hundred sentences drawn from Reuters, we compare the performance of our CRF coder against TABARI (Schrodt 2001), an automated event coder in active use in the political science community. Our comparison focuses on two important WEIS event categories: force (22) and comment (02). We demonstrate that on the difficult-to-code force category, our CRF coder achieves an accuracy of 72%, recall of 70% and precision of 91%. In contrast, TABARI achieves an accuracy of 22%, recall of 7% and precision of 50%. We explain the sources of power in the CRF model and conclude by describing extensions to our model to code events in all 22 WEIS categories.
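
The word-level labeling described in the abstract can be illustrated with a minimal sketch of Viterbi decoding over a linear-chain model, followed by grouping the labeled words into an actor, event, and target. This is not the authors' model: the label names, the example phrase, and every score below are hypothetical hand-set values standing in for learned CRF weights.

```python
from typing import Dict, List, Tuple

# Hypothetical label set; a full coder would need one event label per WEIS category.
LABELS = ["ACTOR", "EVENT_FORCE", "TARGET", "OTHER"]

# Hand-set toy scores standing in for learned weights (assumptions, not from the paper).
WORD_SCORES: Dict[Tuple[str, str], float] = {
    ("rebels", "ACTOR"): 2.0, ("rebels", "TARGET"): 1.5,
    ("attacked", "EVENT_FORCE"): 3.0,
    ("government", "ACTOR"): 1.0, ("government", "TARGET"): 1.0,
    ("troops", "ACTOR"): 1.5, ("troops", "TARGET"): 2.0,
}
TRANSITION_SCORES: Dict[Tuple[str, str], float] = {
    ("ACTOR", "ACTOR"): 0.5,
    ("ACTOR", "EVENT_FORCE"): 1.0,
    ("EVENT_FORCE", "TARGET"): 1.0,
    ("TARGET", "TARGET"): 0.5,
}


def emission(word: str, label: str) -> float:
    """Score for assigning `label` to `word`; unknown words mildly prefer OTHER."""
    default = 0.5 if label == "OTHER" else 0.0
    return WORD_SCORES.get((word.lower(), label), default)


def viterbi(words: List[str]) -> List[str]:
    """Return the highest-scoring label sequence under the toy scores above."""
    if not words:
        return []
    best = [{lab: emission(words[0], lab) for lab in LABELS}]
    back: List[Dict[str, str]] = []
    for word in words[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for lab in LABELS:
            # Pick the best previous label for the current label.
            prev = max(
                LABELS,
                key=lambda p: best[-1][p] + TRANSITION_SCORES.get((p, lab), 0.0),
            )
            scores[lab] = (
                best[-1][prev]
                + TRANSITION_SCORES.get((prev, lab), 0.0)
                + emission(word, lab)
            )
            pointers[lab] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def to_event(words: List[str], labels: List[str]) -> Dict[str, str]:
    """Group labeled words into actor, event, and target strings."""
    parts: Dict[str, List[str]] = {"actor": [], "event": [], "target": []}
    for word, label in zip(words, labels):
        if label == "ACTOR":
            parts["actor"].append(word)
        elif label == "TARGET":
            parts["target"].append(word)
        elif label.startswith("EVENT_"):
            parts["event"].append(word)
    return {key: " ".join(value) for key, value in parts.items()}


phrase = ["Rebels", "attacked", "government", "troops"]
labels = viterbi(phrase)
event = to_event(phrase, labels)
```

In a trained CRF the emission and transition scores would come from weighted feature functions fit to annotated sentences; the decoding step itself would look much the same.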

Similar resources

Learning to Recognize Complex Actions Using Conditional Random Fields

Surveillance systems that operate continuously generate large volumes of data. One such system is described here, continuously tracking and storing observations taken from multiple stereo systems. Automated event recognition is one way of annotating track databases for faster search and retrieval. Recognition of complex events in such data sets often requires context for successful disambiguati...

DETCIC: Detection of Elongated Touching Cells with Inhomogeneous Illumination using a Stack of Conditional Random Fields

Automated detection of touching cells in images with inhomogeneous illumination is a challenging problem. A detection framework using a stack of two conditional random fields is proposed to detect touching elongated cells in scanning electron microscopy images with inhomogeneous illumination. The first conditional random field employs shading information to segment the cells where the effect of...

Conditional Random Fields for Airborne Lidar Point Cloud Classification in Urban Area

Over the past decades, urban growth has been known as a worldwide phenomenon that includes widening process and expanding pattern. While the cities are changing rapidly, their quantitative analysis as well as decision making in urban planning can benefit from two-dimensional (2D) and three-dimensional (3D) digital models. The recent developments in imaging and non-imaging sensor technologies, s...

Blog Comments Classification using Tree Structured Conditional Random Fields

The Internet provides a variety of ways for people to easily share, socialize, and interact with each other. One of the most popular platforms is the online blog. This causes a vast amount of new text data in the form of blog comments and opinions about news, events and products being generated every day. However, not all comments have equal quality. Informative or high quality comments have gre...

Detecting Informative Blog Comments using Tree Structured Conditional Random Fields

The Internet provides a variety of ways for people to easily share, socialize, and interact with each other. One of the most popular platforms is the online blog. This causes a vast amount of new text data in the form of blog comments and opinions about news, events and products being generated every day. However, not all comments are informative. Informative or high quality comments have great ...

Publication date: 2006